We thank all reviewers for their thoughtful comments.

"The method is only compared to prior models with long-term memory on the [QA] task, and doesn't perform as [...]" Response: This is expected, as these are ML models with non-biological [...]. Our goal was to show that simple local Hebbian plasticity can be utilized to solve many of these tasks.

"Is it essential that the key-value [...]" Response: Our goal was to show that simple local plasticity is sufficient for many tasks.

"How and why do the query and storage keys [...]"

"[...] isn't it possible to achieve good performance on the tasks in the paper [...]" Response: This approach is rather close to the approach of MemN2N.

"[...] it would be helpful to explain the practical or physiological relevance in more detail."
Review for NeurIPS paper: H-Mem: Harnessing synaptic plasticity with Hebbian Memory Networks
The motivation of the model is unclear. In other words, why does this model work on the two tasks? It is not enough to say that the model uses a Hebbian rule, which agrees with biological systems, and should therefore work. A reason, or intuition, from a machine-learning perspective should be provided. I would like to see explanations for both tasks in the rebuttal.
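For intuition, the kind of mechanism under review can be sketched as a key-value memory trained with a purely local, Hebbian-style delta update: writing strengthens the association from a key vector to a value vector, and reading back is a single matrix-vector product. This is an illustrative sketch, not the paper's exact equations; the function names and the learning rate are assumptions.

```python
import numpy as np

def hebbian_write(W, key, value, eta=0.5):
    """Local Hebbian-style delta update: strengthen the key -> value
    association; the (value - W @ key) term gates learning so that
    repeated writes of the same pair converge instead of growing unboundedly."""
    return W + eta * np.outer(value - W @ key, key)

def hebbian_read(W, query):
    """Retrieval is a single matrix-vector product with the query key."""
    return W @ query

d = 4
W = np.zeros((d, d))
key = np.array([1.0, 0.0, 0.0, 0.0])    # one-hot key for clarity
value = np.array([0.0, 2.0, 0.0, 0.0])  # value to associate with the key

for _ in range(10):                     # repeated presentations
    W = hebbian_write(W, key, value)

recalled = hebbian_read(W, key)         # close to `value` after training
```

Both the write and the read depend only on pre- and post-synaptic activity, which is what makes the rule local.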
Temporal Knowledge Sharing enable Spiking Neural Network Learning from Past and Future
Dong, Yiting, Zhao, Dongcheng, Zeng, Yi
Spiking Neural Networks (SNNs) have attracted significant attention from researchers across various domains due to their brain-like information processing mechanism. However, SNNs typically grapple with challenges such as extended time steps, low temporal information utilization, and the requirement for consistent time steps between testing and training. These challenges leave SNNs with high latency. Moreover, the constraint on time steps necessitates retraining the model for new deployments, reducing adaptability. To address these issues, this paper proposes a novel perspective, viewing the SNN as a temporal aggregation model. We introduce the Temporal Knowledge Sharing (TKS) method, facilitating information interaction between different time points. TKS can be perceived as a form of temporal self-distillation. To validate the efficacy of TKS in information processing, we tested it on static datasets such as CIFAR10, CIFAR100, and ImageNet-1k, and on neuromorphic datasets such as DVS-CIFAR10 and NCALTECH101. Experimental results demonstrate that our method achieves state-of-the-art performance compared to other algorithms. Furthermore, TKS addresses the temporal consistency challenge, endowing the model with superior temporal generalization capabilities. This allows the network to train with longer time steps and maintain high performance during testing with shorter time steps. Such an approach considerably accelerates the deployment of SNNs on edge devices. Finally, we conducted ablation experiments and tested TKS on fine-grained tasks, with results showcasing TKS's enhanced capability to process information efficiently.
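Viewed as temporal self-distillation, the core idea of TKS can be sketched as follows: the prediction aggregated over all time steps serves as a teacher for the prediction at each individual time step. The loss form and names below are assumptions for illustration, not the authors' exact formulation.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def temporal_self_distillation_loss(logits_per_step):
    """The time-averaged prediction acts as a teacher for every
    individual time step (KL divergence, averaged over steps)."""
    teacher = softmax(np.mean(logits_per_step, axis=0))
    loss = 0.0
    for logits in logits_per_step:
        student = softmax(logits)
        loss += np.sum(teacher * (np.log(teacher) - np.log(student)))
    return loss / len(logits_per_step)

T, C = 4, 3                                     # time steps, classes
rng = np.random.default_rng(0)
logits = rng.normal(size=(T, C))                # per-time-step outputs
loss = temporal_self_distillation_loss(logits)  # zero iff all steps agree
```

Because the teacher is built from the network's own outputs across time, no external teacher model is needed, which is what makes this "knowledge sharing" rather than conventional distillation.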
FreeLM: Fine-Tuning-Free Language Model
Li, Xiang, Jiang, Xin, Meng, Xuying, Sun, Aixin, Wang, Yequan
Pre-trained language models (PLMs) have achieved remarkable success in NLP tasks. Despite this great success, mainstream solutions largely follow the pre-training then fine-tuning paradigm, which brings both high deployment costs and low training efficiency. Nevertheless, fine-tuning on a specific task is essential, because PLMs are only pre-trained with a language signal from large raw data. In this paper, we propose a novel fine-tuning-free strategy for language models that considers both a language signal and a teacher signal. The teacher signal is an abstraction of a battery of downstream tasks, provided in a unified proposition format. Trained with both language and strong task-aware teacher signals in an interactive manner, our FreeLM model demonstrates strong generalization and robustness. FreeLM outperforms large models, e.g., GPT-3 and InstructGPT, on a range of language understanding tasks in our experiments, while being much smaller, with 0.3B parameters compared to 175B in these models.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (10 more...)
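FreeLM's teacher signal recasts labeled downstream examples as propositions to be judged true or false. A hypothetical sketch of such a unified proposition format follows; the task templates and names are invented for illustration and are not the paper's actual templates.

```python
def to_proposition(task, example, label_text):
    """Hypothetical template: recast a labeled example as a natural-
    language proposition; pairing it with a truth bit (1 for the gold
    label, 0 for a corrupted label) yields a unified teacher signal."""
    if task == "sentiment":
        return f'The sentiment of "{example}" is {label_text}.'
    if task == "nli":
        premise, hypothesis = example
        return f'"{premise}" entails "{hypothesis}".'
    raise ValueError(f"unknown task: {task}")

# One true proposition and one corrupted proposition for the same example:
pos = (to_proposition("sentiment", "a delightful film", "positive"), 1)
neg = (to_proposition("sentiment", "a delightful film", "negative"), 0)
```

Because every task reduces to the same proposition-plus-truth-bit shape, a single model can absorb many tasks at once, which is what removes the need for per-task fine-tuning.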
General Cross-Architecture Distillation of Pretrained Language Models into Matrix Embeddings
Galke, Lukas, Cuber, Isabelle, Meyer, Christoph, Nölscher, Henrik Ferdinand, Sonderecker, Angelina, Scherp, Ansgar
Large pretrained language models (PreLMs) are revolutionizing natural language processing across all benchmarks. However, their sheer size is prohibitive for small laboratories or for deployment on mobile devices. Approaches like pruning and distillation reduce the model size but typically retain the same model architecture. In contrast, we explore distilling PreLMs into a different, more efficient architecture, Continual Multiplication of Words (CMOW), which embeds each word as a matrix and uses matrix multiplication to encode sequences. We extend the CMOW architecture and its CMOW/CBOW-Hybrid variant with a bidirectional component for more expressive power, per-token representations for a general (task-agnostic) distillation during pretraining, and a two-sequence encoding scheme that facilitates downstream tasks on sentence pairs, such as sentence similarity and natural language inference. Our matrix-based bidirectional CMOW/CBOW-Hybrid model is competitive with DistilBERT on question similarity and recognizing textual entailment, but uses only half the number of parameters and is three times faster in terms of inference speed. We match or exceed the scores of ELMo for all tasks of the GLUE benchmark except for the sentiment analysis task SST-2 and the linguistic acceptability task CoLA. However, compared to previous cross-architecture distillation approaches, we demonstrate a doubling of the scores on detecting linguistic acceptability. This shows that matrix-based embeddings can be used to distill large PreLMs into competitive models, and it motivates further research in this direction.
- North America > Canada > Ontario > Toronto (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Netherlands > Gelderland > Nijmegen (0.04)
- Europe > Germany (0.04)
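The CMOW idea of encoding a sequence by multiplying per-word matrices can be sketched in a few lines; unlike a bag-of-words sum, the result is order-sensitive. The initialization and dimensions below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def cmow_encode(token_ids, E):
    """CMOW-style encoding: each token id indexes a d x d matrix, and
    the sequence representation is the ordered matrix product."""
    enc = np.eye(E.shape[1])
    for t in token_ids:
        enc = enc @ E[t]
    return enc

rng = np.random.default_rng(0)
vocab, d = 10, 3
# Initialize near the identity so long products stay well-conditioned.
E = np.stack([np.eye(d) + 0.1 * rng.normal(size=(d, d)) for _ in range(vocab)])

ab = cmow_encode([1, 2], E)
ba = cmow_encode([2, 1], E)
order_sensitive = not np.allclose(ab, ba)  # matrix products do not commute
```

A bidirectional variant, as described in the abstract, would additionally multiply the sequence right-to-left and concatenate the two encodings; the CMOW/CBOW-Hybrid further concatenates an order-insensitive CBOW-style sum.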
Learning to classify complex patterns using a VLSI network of spiking neurons
Mitra, Srinjoy, Indiveri, Giacomo, Fusi, Stefano
We propose a compact, low-power VLSI network of spiking neurons which can learn to classify complex patterns of mean firing rates online and in real-time. The network of integrate-and-fire neurons is connected by bistable synapses that can change their weight using a local spike-based plasticity mechanism. Learning is supervised by a teacher which provides an extra input to the output neurons during training. The synaptic weights are updated only if the current generated by the plastic synapses does not match the output desired by the teacher (as in the perceptron learning rule). We present experimental results that demonstrate how this VLSI network is able to robustly classify uncorrelated linearly separable spatial patterns of mean firing rates.
- North America > United States > New York (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > San Mateo County > San Mateo (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
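The teacher-gated, perceptron-like rule described in the abstract (weights change only when the output neuron disagrees with the teacher) can be sketched in software, abstracting away the spiking dynamics and bistable-synapse hardware; all names below are illustrative.

```python
import numpy as np

def teacher_gated_step(w, x, teacher, theta=0.0):
    """Perceptron-like stop-learning rule: weights change only when the
    output neuron's response disagrees with the teacher's desired output."""
    out = 1 if w @ x > theta else 0
    if out != teacher:
        w = w + (x if teacher == 1 else -x)
    return w

# Four binary rate patterns; the neuron should fire iff input 0 is active.
X = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]])
y = [1, 1, 0, 0]

w = np.zeros(2)
for _ in range(10):                 # a few passes suffice on this toy task
    for x, t in zip(X, y):
        w = teacher_gated_step(w, x, t)

preds = [1 if w @ x > 0 else 0 for x in X]
```

The stop-learning condition matters in hardware: once the output matches the teacher, synapses stop switching, which keeps the bistable weights stable during correct classification.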
Sigma-Pi Learning: On Radial Basis Functions and Cortical Associative Learning
Mel, Bartlett W., Koch, Christof
The goal in this work has been to identify the neuronal elements of the cortical column that are most likely to support the learning of nonlinear associative maps. We show that a particular style of network learning algorithm based on locally-tuned receptive fields maps naturally onto cortical hardware, and gives coherence to a variety of features of cortical anatomy, physiology, and biophysics whose relations to learning remain poorly understood.
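A network of locally tuned receptive fields of the kind discussed here can be sketched as a Gaussian radial basis function layer; in the sigma-pi reading, each local response factorizes into a product of per-dimension tuning terms, which is where products of inputs enter. The parameters below are illustrative assumptions.

```python
import numpy as np

def rbf_output(x, centers, widths, weights):
    """Locally tuned receptive fields: each hidden unit responds only to
    inputs near its center; the output is a weighted sum of these
    local Gaussian responses."""
    acts = np.exp(-np.sum((centers - x) ** 2, axis=1) / (2.0 * widths ** 2))
    return weights @ acts

centers = np.array([[0.0, 0.0], [1.0, 1.0]])  # two receptive-field centers
widths = np.array([0.3, 0.3])
weights = np.array([1.0, -1.0])

near_first = rbf_output(np.array([0.0, 0.0]), centers, widths, weights)
near_second = rbf_output(np.array([1.0, 1.0]), centers, widths, weights)
```

Since exp(-sum_i (c_i - x_i)^2 / 2s^2) equals the product over i of exp(-(c_i - x_i)^2 / 2s^2), each Gaussian unit is a product of one-dimensional tuning curves, i.e., a sigma-pi unit whose multiplicative terms are locally tuned.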